Skip to content

Conversation

mizvekov
Copy link
Collaborator

@mizvekov mizvekov commented Jul 30, 2025

Adds MSan sanitizer job.

This builds libc++, if the host is capable of it and for non-release jobs, and uses it instead of the system library.

This libc++ will be built with whatever sanitizer is selected, increasing the effectiveness of those tests.
This is also a requirement for MSan, which requires everything to be compiled with it.

Building the libc++ separately, with the host compiler, also decreases the maximum amount of disk space the CI jobs need.

In terms of host compiler, the libc++ being used requires [apple-]clang >= 19 or GCC 15. The other compilers are still built in the pre-existing manner, using the bootstrap build.

For now, GCC 15 doesn't use this yet, because there is some issue linking the library which needs to be investigated.

apple-clang, when built with sanitizers, also has a special requirement which is not being met, so is also not opting into the standalone libc++ build. This is left to improve later.

@mizvekov mizvekov self-assigned this Jul 30, 2025
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

1 similar comment
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@mizvekov mizvekov changed the title ci: add asan and msan Linux jobs ci: add msan Linux job Aug 15, 2025
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@mizvekov mizvekov force-pushed the ci-enable-msan branch 5 times, most recently from 204baa9 to 7cd0ed0 Compare September 3, 2025 17:22
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

1 similar comment
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@mizvekov mizvekov force-pushed the ci-enable-msan branch 2 times, most recently from 53b3346 to b39ca51 Compare September 9, 2025 19:23
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@mizvekov mizvekov force-pushed the ci-enable-msan branch 3 times, most recently from b39ca51 to 6235457 Compare September 10, 2025 01:42
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@mizvekov mizvekov marked this pull request as ready for review September 10, 2025 16:21
Copy link
Collaborator

@alandefreitas alandefreitas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to replicate this logic in bootstrap because I don't think anyone can replicate this themselves. So if there's an error in MSan everyone will be blocked from solving it locally.

@@ -66,21 +66,29 @@ jobs:
clang: git build-essential pkg-config python3 curl openjdk-11-jdk pkg-config libncurses-dev libxml2-utils libxml2-dev g++-14=14.2.0-4ubuntu2~24.04
msvc: ''
extra-values: |
llvm-hash: a1b6e7ff393533a5c4f3bdfd4efe5da106e2de2b
llvm-build-preset-prefix: {{#if optimized-debug}}debwithopt{{else}}{{{lowercase build-type}}}{{/if}}
use-libcxx: {{#if (and (ieq compiler 'clang') (or (eq major 19) (eq major 20) (eq major 21))) }}true{{else}}false{{/if}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang will be always tested with libcxx now? Even on linux? Even when unrelated to sanitizers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a ge (an "operator>=") handlebars helper?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find an operator equivalent to >=, it should be spelled 'ge' I think for consistency with naming of existing operators.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I felt that there was some value in using these on clang even without sanitizers, because at least the libcxx built here will be controlled by the same flags, so it will have debug info, will have runtime assertions if we choose to enable it, and so on. It would give some extra variation on top of what we already do on the GCC jobs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have it but we need to define it.

use-libcxx: {{#if (and (ieq compiler 'clang') (or (eq major 19) (eq major 20) (eq major 21))) }}true{{else}}false{{/if}}
libcxx-runtimes: libcxx{{#if (ne compiler 'msvc')}};libcxxabi{{/if}}
llvm-runtimes: {{#if (ine use-libcxx 'true') }}{{{ libcxx-runtimes }}}{{/if}}
llvm-path: {{#if (ieq compiler 'msvc') }}D:/a{{else if (ieq compiler 'apple-clang')}}/Users/runner/work{{else}}/__w{{/if}}/llvm
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard-coding this is very unstable. Any new container will keep breaking this.

I know it's so we can -isystem{{{ llvm-path }}}/include/c++/v1{{/if}} later on, but the expression can still be evaluated at the job step because we're allowed to use more than one variable in the cmake workflow step or wherever you might want these flags.

In GHA actions, expressions are represented with ${{ ... }}, such as:

https://github.com/doxygen/doxygen/releases/download/Release_1_9_7/doxygen-1.9.7.{{ os }}${{ ( runner.os == 'Windows' && '.x64' ) || '' }}.bin.${{ ( runner.os == 'Windows' && 'zip' ) || 'tar.gz' }}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good if cpp-actions supported expanding these '${{ ' expressions later, but that doesn't work ATM,
this string would end up used as-is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know hard-coding these paths is not ideal, but GitHub is not very helpful here, and the options to avoid this end up being much more complicated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good if cpp-actions supported expanding these '${{ ' expressions later, but that doesn't work ATM,

Yes. This is impossible.

but GitHub is not very helpful here

The example above is how you expand that later. We use this solution all the time. Even mrdocs ci.yml uses it all the time, for instance:

${{ github.event_name == 'push' && (contains(fromJSON('["master", "develop"]'), github.ref_name) || startsWith(github.ref, 'refs/tags/')) }}

There's nothing wrong with this pattern to expand it later.

@@ -198,18 +206,15 @@ jobs:
with:
apt-get: ${{ matrix.install }}

- name: Resolve LLVM Root
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This was the step resolving it.

@@ -261,9 +266,44 @@ jobs:
echo "found=$found" >> $GITHUB_OUTPUT
fi

- name: Install libc++
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to include some comments here because this is very confusing. I undertand why this is being done, but it's very confusing to anyone new at this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait... Now I'm not sure even I understand it. The "Install LLVM" step is building libc++ again, right? Like, every single time for all workflows. Is that correct? Isn't there a better way?

Copy link
Collaborator Author

@mizvekov mizvekov Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it's not building it again. Either this step runs because we have 'use-libcxx' true, and then we don't build libc++ later using the LLVM bootstrap build, or this step is skipped because 'use-libcxx' is not true, but then we build libc++ on the bootstrap build later.

The libc++ on this step is built using the host compiler, and this can only be done with the compilers which are supported by it: [apple-]clang >= 19 and gcc >= 15.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By again, I mean later. This step is building it for the first time when use-libcxx is true, and that's clear. The problem is the "Install LLVM" step will always build it for a second time (if use-libcxx is true).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am saying the "Install LLVM" won't build it for a second time, because if 'use-libcxx' is true, then llvm-runtimes will be empty and we won't build any runtime component in that step.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually crucial by the way, if we didn't build these things separately, we just wouldn't have disk space for the sanitizer jobs.

Copy link
Collaborator

@alandefreitas alandefreitas Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am saying the "Install LLVM" won't build it for a second time, because if 'use-libcxx' is true, then llvm-runtimes will be empty and we won't build any runtime component in that step.

OK. Please add some comments in the ci.yml file because the logic is quite hard to follow and remember.

We also need that logic replicated in the bootstrap. Maybe the libcxx step should always be separate there. I don't know.

@@ -882,25 +930,12 @@ jobs:
fi
echo "exists=$exists" >> $GITHUB_OUTPUT

- name: Resolve LLVM Root
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here...

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

MSan is viral, so it is added to Clang only.

Do not add MSan to MacOS job, since it is not supported in that platform.
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://950.mrdocs.prtest2.cppalliance.org/index.html

@alandefreitas
Copy link
Collaborator

We'll need to replicate this logic in bootstrap because I don't think anyone can replicate this themselves. So if there's an error in MSan everyone will be blocked from solving it locally.

@mizvekov
Copy link
Collaborator Author

We'll need to replicate this logic in bootstrap because I don't think anyone can replicate this themselves. So if there's an error in MSan everyone will be blocked from solving it locally.

I think we don't need to hold this PR for that and can do that in a follow up, WDYT?

@alandefreitas
Copy link
Collaborator

alandefreitas commented Sep 12, 2025

I think we don't need to hold this PR for that and can do that in a follow up, WDYT?

In summary, almost everyone is now using bootstrap.py, so there's no trade-off solution where waiting that brings us any value.

In more detail, having it in the bootstrap procedure is essential because I keep in touch with a lot of people, and I know for sure many other users and developers are already relying on this bootstrap script, even though I know you set things up differently. This is in part because I made it the official way to build it from source https://mrdocs.com/docs/mrdocs/install.html#mrdocs-bootstrap-script, and I don't think anyone has done the process manually since then. So, suppose someone opens a PR and it only fails with MSan. Now we have two options: all PRs that fail MSan get blocked, or this person gets annoyed and disables MSan so that doesn't happen. In both cases, the MSan tests became useless, and we don't want that. We want tests to be helpful.

We could say "let's do it later and let it be useless until we open the other PR fixing bootstrap.py". But there are two problems with this. The first problem is that if it's going to be useless until the other PR anyway, so waiting is not a problem. And the second problem is these things have to go together because I'm not qualified to do it, and my intuition is you'll never really do it. And my intuition is reasonable here because what seems to be blocking you right now is that you're not used to using the bootstrap script, and you don't want to go through the learning curve because you already had to set everything up manually. But as far as this PR is concerned, doing it in another PR won't solve this problem (it will hurt just as much). And as far as Mr.Docs is concerned, I would be extremely helpful if you got familiar with this script as soon as possible because this is now the official way of building things, you'd understand how people are building it, the tools we have available for other developers, and we could use lots of insights that only you could provide because you are just very talented at that. Also, because as part of these improvements in CI, we want to use the bootstrap script directly there, so we can be sure it's always correct.

Alternatively, as a simpler hypothesis, if this is just an anxiety issue because you've been working on this for a long time and want to see it merge, please don't worry about that. We all know that's a complex issue and that it takes time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants